{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "Question_Answering_Squad.ipynb",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": [
        "daYw_Xll2ZR9"
      ],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.6"
    },
    "pycharm": {
      "stem_cell": {
        "cell_type": "raw",
        "metadata": {
          "collapsed": false
        },
        "source": []
      }
    }
  },
  "cells": [
    {
      "cell_type": "code",
      "metadata": {
        "id": "uRLPr0TnIAHO"
      },
      "source": [
        "BRANCH = 'v1.0.0'"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "o_0K1lsW1dj9"
      },
      "source": [
        "\"\"\"\n",
        "You can run either this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.\n",
        "\n",
        "Instructions for setting up Colab are as follows:\n",
        "1. Open a new Python 3 notebook.\n",
        "2. Import this notebook from GitHub (File -> Upload Notebook -> \"GITHUB\" tab -> copy/paste GitHub URL)\n",
        "3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select \"GPU\" for hardware accelerator)\n",
        "4. Run this cell to set up dependencies.\n",
        "\"\"\"\n",
        "# If you're using Google Colab and not running locally, run this cell\n",
        "\n",
        "# install NeMo\n",
        "!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[nlp]"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dzqD2WDFOIN-"
      },
      "source": [
        "from nemo.utils.exp_manager import exp_manager\n",
        "from nemo.collections import nlp as nemo_nlp\n",
        "\n",
        "import os\n",
        "import wget \n",
        "import torch\n",
        "import pytorch_lightning as pl\n",
        "from omegaconf import OmegaConf"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "daYw_Xll2ZR9"
      },
      "source": [
        "# Task Description\n",
        "Given a question and a context both in natural language, predict the span within the context with a start and end position which indicates the answer to the question.\n",
        "For every word in our training dataset we’re going to predict:\n",
        "- likelihood this word is the start of the span \n",
        "- likelihood this word is the end of the span \n",
        "\n",
        "We are using a pretrained [BERT](https://arxiv.org/pdf/1810.04805.pdf) encoder with 2 span prediction heads for prediction start and end position of the answer. The span predictions are token classifiers consisting of a single linear layer. "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZnuziSwJ1yEB"
      },
      "source": [
        "# Dataset\n",
        "This model expects the dataset to be in [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) format, e.g. a JSON file for each dataset split. \n",
        "In the following we will show example for a training file. Each title has one or multiple paragraph entries, each consisting of the text - \"context\", and question-answer entries. Each question-answer entry has:\n",
        "* a question\n",
        "* a globally unique id\n",
        "* a boolean flag \"is_impossible\" which shows if the question is answerable or not\n",
        "* in case the question is answerable one answer entry, which contains the text span and its starting character index in the context. If not answerable, the \"answers\" list is empty\n",
        "\n",
        "The evaluation files (for validation and testing) follow the above format except for it can provide more than one answer to the same question. \n",
        "The inference file follows the above format except for it does not require the \"answers\" and \"is_impossible\" keywords.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TXFORGBv2Jqu"
      },
      "source": [
        "\n",
        "\n",
        "```\n",
        "{\n",
        "    \"data\": [\n",
        "        {\n",
        "            \"title\": \"Super_Bowl_50\", \n",
        "            \"paragraphs\": [\n",
        "                {\n",
        "                    \"context\": \"Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24\\u201310 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the \\\"golden anniversary\\\" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as \\\"Super Bowl L\\\"), so that the logo could prominently feature the Arabic numerals 50.\", \n",
        "                    \"qas\": [\n",
        "                        {\n",
        "                            \"question\": \"Where did Super Bowl 50 take place?\", \n",
        "                            \"is_impossible\": \"false\", \n",
        "                            \"id\": \"56be4db0acb8001400a502ee\", \n",
        "                            \"answers\": [\n",
        "                                {\n",
        "                                    \"answer_start\": \"403\", \n",
        "                                    \"text\": \"Santa Clara, California\"\n",
        "                                }\n",
        "                            ]\n",
        "                        },\n",
        "                        {\n",
        "                            \"question\": \"What was the winning score of the Super Bowl 50?\", \n",
        "                            \"is_impossible\": \"true\", \n",
        "                            \"id\": \"56be4db0acb8001400a502ez\", \n",
        "                            \"answers\": [\n",
        "                            ]\n",
        "                        }\n",
        "                    ]\n",
        "                }\n",
        "            ]\n",
        "        }\n",
        "    ]\n",
        "}\n",
        "...\n",
        "```\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SL58EWkd2ZVb"
      },
      "source": [
        "## Download the data"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "THi6s1Qx2G1k"
      },
      "source": [
        "In this notebook we are going download the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset to showcase how to do training and inference. There are two datasets, SQuAD1.0 and SQuAD2.0. SQuAD 1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles. SQuAD2.0 dataset combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. \n",
        "\n",
        "\n",
        "To download both datasets, we use  [NeMo/examples/nlp/question_answering/get_squad.py](https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/question_answering/get_squad.py). \n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tv3qXTTR_hBk"
      },
      "source": [
        "# set the following paths\n",
        "DATA_DIR = \"PATH_TO_DATA\"\n",
        "WORK_DIR = \"PATH_TO_CHECKPOINTS_AND_LOGS\""
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qcz3Djem_hBn"
      },
      "source": [
        "## download get_squad.py script to download and preprocess the SQuAD data\n",
        "os.makedirs(WORK_DIR, exist_ok=True)\n",
        "if not os.path.exists(WORK_DIR + '/get_squad.py'):\n",
        "    print('Downloading get_squad.py...')\n",
        "    wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/question_answering/get_squad.py', WORK_DIR)\n",
        "else:\n",
        "    print ('get_squad.py already exists')"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mpzsC41t_hBq"
      },
      "source": [
        "# download and preprocess the data\n",
        "! python $WORK_DIR/get_squad.py --destDir $DATA_DIR"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m_HLLl6t_hBs"
      },
      "source": [
        "after execution of the above cell, your data folder will contain a subfolder \"squad\" the following 4 files for training and evaluation\n",
        "- v1.1/train-v1.1.json\n",
        "- v1.1/dev-v1.1.json\n",
        "- v2.0/train-v2.0.json\n",
        "- v2.0/dev-v2.0.json"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qYHcfxPL_hBt"
      },
      "source": [
        "! ls -LR {DATA_DIR}/squad"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bdpikZVreLlI"
      },
      "source": [
        "## Data preprocessing\n",
        "\n",
        "The input into the model is the concatenation of two tokenized sequences:\n",
        "\" [CLS] query [SEP] context [SEP]\".\n",
        "This is the tokenization used for BERT, i.e. [WordPiece](https://arxiv.org/pdf/1609.08144.pdf) Tokenizer, which uses the [Google's BERT vocabulary](https://github.com/google-research/bert). This tokenizer is configured with `model.tokenizer.tokenizer_name=bert-base-uncased` and is automatically instantiated using [Huggingface](https://huggingface.co/)'s API. \n",
        "The benefit of this tokenizer is that this is compatible with a pretrained BERT model, from which we can finetune instead of training the question answering model from scratch. However, we also support other tokenizers, such as `model.tokenizer.tokenizer_name=sentencepiece`. Unlike the BERT WordPiece tokenizer, the [SentencePiece](https://github.com/google/sentencepiece) tokenizer model needs to be first created from a text file.\n",
        "See [02_NLP_Tokenizers.ipynb](https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/nlp/02_NLP_Tokenizers.ipynb) for more details on how to use NeMo Tokenizers."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0q7Y7nyW_hBv"
      },
      "source": [
        "# Data and Model Parameters\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "B0b0Tn8M_hBv"
      },
      "source": [
        "Note, this is only an example to showcase usage and is not optimized for accuracy. In the following, we will download and adjust the model configuration to create a toy example, where we only use a small fraction of the original dataset. \n",
        "\n",
        "In order to train the full SQuAD model, leave the model parameters from the configuration file unchanged. This sets NUM_SAMPLES=-1 to use the entire dataset, which will slow down performance significantly. We recommend to use bash script and multi-GPU to accelerate this. \n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "n8HZrDmr12_-"
      },
      "source": [
        "# This is the model configuration file that we will download, do not change this\n",
        "MODEL_CONFIG = \"question_answering_squad_config.yaml\"\n",
        "\n",
        "# model parameters, play with these\n",
        "BATCH_SIZE = 12\n",
        "MAX_SEQ_LENGTH = 384\n",
        "# specify BERT-like model, you want to use\n",
        "PRETRAINED_BERT_MODEL = \"bert-base-uncased\"\n",
        "TOKENIZER_NAME = \"bert-base-uncased\" # tokenizer name\n",
        "\n",
        "# Number of data examples used for training, validation, test and inference\n",
        "TRAIN_NUM_SAMPLES = VAL_NUM_SAMPLES = TEST_NUM_SAMPLES = 5000 \n",
        "INFER_NUM_SAMPLES = 5\n",
        "\n",
        "TRAIN_FILE = f\"{DATA_DIR}/squad/v1.1/train-v1.1.json\"\n",
        "VAL_FILE = f\"{DATA_DIR}/squad/v1.1/dev-v1.1.json\"\n",
        "TEST_FILE = f\"{DATA_DIR}/squad/v1.1/dev-v1.1.json\"\n",
        "INFER_FILE = f\"{DATA_DIR}/squad/v1.1/dev-v1.1.json\"\n",
        "\n",
        "INFER_PREDICTION_OUTPUT_FILE = \"output_prediction.json\"\n",
        "INFER_NBEST_OUTPUT_FILE = \"output_nbest.json\"\n",
        "\n",
        "# training parameters\n",
        "LEARNING_RATE = 0.00003\n",
        "\n",
        "# number of epochs\n",
        "MAX_EPOCHS = 1"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "daludzzL2Jba"
      },
      "source": [
        "# Model Configuration"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_whKCxfTMo6Y"
      },
      "source": [
        "The model is defined in a config file which declares multiple important sections. They are:\n",
        "- **model**: All arguments that will relate to the Model - language model, span prediction, optimizer and schedulers, datasets and any other related information\n",
        "\n",
        "- **trainer**: Any argument to be passed to PyTorch Lightning"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "T1gA8PsJ13MJ"
      },
      "source": [
        "# download the model's default configuration file \n",
        "config_dir = WORK_DIR + '/configs/'\n",
        "os.makedirs(config_dir, exist_ok=True)\n",
        "if not os.path.exists(config_dir + MODEL_CONFIG):\n",
        "    print('Downloading config file...')\n",
        "    wget.download(f'https://raw.githubusercontent.com/NVIDIA/NeMo/{BRANCH}/examples/nlp/question_answering/conf/{MODEL_CONFIG}', config_dir)\n",
        "else:\n",
        "    print ('config file is already exists')"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mX3KmWMvSUQw"
      },
      "source": [
        "# this line will print the entire default config of the model\n",
        "config_path = f'{WORK_DIR}/configs/{MODEL_CONFIG}'\n",
        "print(config_path)\n",
        "config = OmegaConf.load(config_path)\n",
        "print(OmegaConf.to_yaml(config))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZCgWzNBkaQLZ"
      },
      "source": [
        "## Setting up data within the config\n",
        "\n",
        "Among other things, the config file contains dictionaries called dataset, train_ds and validation_ds, test_ds. These are configurations used to setup the Dataset and DataLoaders of the corresponding config.\n",
        "\n",
        "Specify data paths using `model.train_ds.file`, `model.valuation_ds.file` and `model.test_ds.file`.\n",
        "\n",
        "Let's now add the data paths to the config."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "LQHCJN-ZaoLp"
      },
      "source": [
        "config.model.train_ds.file = TRAIN_FILE\n",
        "config.model.validation_ds.file = VAL_FILE\n",
        "config.model.test_ds.file = TEST_FILE\n",
        "\n",
        "config.model.train_ds.num_samples = TRAIN_NUM_SAMPLES\n",
        "config.model.validation_ds.num_samples = VAL_NUM_SAMPLES\n",
        "config.model.test_ds.num_samples = TEST_NUM_SAMPLES\n",
        "\n",
        "config.model.tokenizer.tokenizer_name = TOKENIZER_NAME"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nB96-3sTc3yk"
      },
      "source": [
        "# Building the PyTorch Lightning Trainer\n",
        "\n",
        "NeMo models are primarily PyTorch Lightning modules - and therefore are entirely compatible with the PyTorch Lightning ecosystem!\n",
        "\n",
        "Let's first instantiate a Trainer object!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "knF6QeQQdMrH"
      },
      "source": [
        "# lets modify some trainer configs\n",
        "# checks if we have GPU available and uses it\n",
        "cuda = 1 if torch.cuda.is_available() else 0\n",
        "config.trainer.gpus = cuda\n",
        "config.trainer.precision = 16 if torch.cuda.is_available() else 32\n",
        "\n",
        "# For mixed precision training, use precision=16 and amp_level=O1\n",
        "\n",
        "config.trainer.max_epochs = MAX_EPOCHS\n",
        "\n",
        "# Remove distributed training flags if only running on a single GPU or CPU\n",
        "config.trainer.accelerator = None\n",
        "\n",
        "print(\"Trainer config - \\n\")\n",
        "print(OmegaConf.to_yaml(config.trainer))\n",
        "\n",
        "trainer = pl.Trainer(**config.trainer)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8IlEMdVxdr6p"
      },
      "source": [
        "# Setting up a NeMo Experiment¶\n",
        "\n",
        "NeMo has an experiment manager that handles logging and checkpointing for us, so let's use it!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8uztqGAmdrYt"
      },
      "source": [
        "config.exp_manager.exp_dir = WORK_DIR\n",
        "exp_dir = exp_manager(trainer, config.get(\"exp_manager\", None))\n",
        "\n",
        "# the exp_dir provides a path to the current experiment for easy access\n",
        "exp_dir = str(exp_dir)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D4jy28fbjekD"
      },
      "source": [
        "# Using an Out-Of-Box Model"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ins2ZzJckKKo"
      },
      "source": [
        "# list available pretrained models\n",
        "nemo_nlp.models.QAModel.list_available_models()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "iFnzHvkVk-S5"
      },
      "source": [
        "# load pretained model\n",
        "pretrained_model_name=\"qa_squadv1.1_bertbase\"\n",
        "model = nemo_nlp.models.QAModel.from_pretrained(model_name=pretrained_model_name)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6FI_nQsJo_11"
      },
      "source": [
        "# Model Training"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8tjLhUvL_o7_"
      },
      "source": [
        "Before initializing the model, we might want to modify some of the model configs."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Xeuc2i7Y_nP5"
      },
      "source": [
        "# complete list of supported BERT-like models\n",
        "nemo_nlp.modules.get_pretrained_lm_models_list()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "RK2xglXyAUOO"
      },
      "source": [
        "# add the specified above model parameters to the config\n",
        "config.model.language_model.pretrained_model_name = PRETRAINED_BERT_MODEL\n",
        "config.model.train_ds.batch_size = BATCH_SIZE\n",
        "config.model.validation_ds.batch_size  = BATCH_SIZE\n",
        "config.model.test_ds.batch_size = BATCH_SIZE\n",
        "config.model.optim.lr = LEARNING_RATE\n",
        "\n",
        "print(\"Updated model config - \\n\")\n",
        "print(OmegaConf.to_yaml(config.model))"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "NgsGLydWo-6-"
      },
      "source": [
        "# initialize the model\n",
        "# dataset we'll be prepared for training and evaluation during\n",
        "model = nemo_nlp.models.QAModel(cfg=config.model, trainer=trainer)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kQ592Tx4pzyB"
      },
      "source": [
        "## Monitoring Training Progress\n",
        "Optionally, you can create a Tensorboard visualization to monitor training progress."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mTJr16_pp0aS"
      },
      "source": [
        "try:\n",
        "  from google import colab\n",
        "  COLAB_ENV = True\n",
        "except (ImportError, ModuleNotFoundError):\n",
        "  COLAB_ENV = False\n",
        "\n",
        "# Load the TensorBoard notebook extension\n",
        "if COLAB_ENV:\n",
        "  %load_ext tensorboard\n",
        "  %tensorboard --logdir {exp_dir}\n",
        "else:\n",
        "  print(\"To use tensorboard, please use this notebook in a Google Colab environment.\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "hUvnSpyjp0Dh"
      },
      "source": [
        "# start the training\n",
        "trainer.fit(model)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JxBiIKMlH8yv"
      },
      "source": [
        "After training for 1 epoch, exact match on the evaluation data should be around 59.2%, F1 around 70.2%."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ynCLBmAWFVsM"
      },
      "source": [
        "# Evaluation\n",
        "\n",
        "To see how the model performs, let’s run evaluation on the test dataset."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XBMCoXAKFtSd"
      },
      "source": [
        "model.setup_test_data(test_data_config=config.model.test_ds)\n",
        "trainer.test(model)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VPdzJVAgSFaJ"
      },
      "source": [
        "# Inference\n",
        "\n",
        "To use the model for creating predictions, let’s run inference on the unlabeled inference dataset."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "DQhsamclRtxJ"
      },
      "source": [
        "# # store test prediction under the experiment output folder\n",
        "output_prediction_file = f\"{exp_dir}/{INFER_PREDICTION_OUTPUT_FILE}\"\n",
        "output_nbest_file = f\"{exp_dir}/{INFER_NBEST_OUTPUT_FILE}\"\n",
        "all_preds, all_nbests = model.inference(file=INFER_FILE, batch_size=5, num_samples=INFER_NUM_SAMPLES, output_nbest_file=output_nbest_file, output_prediction_file=output_prediction_file)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "sQpRIOaM_hCQ"
      },
      "source": [
        "for _, item in all_preds.items():\n",
        "    print(f\"question: {item[0]} answer: {item[1]}\")\n",
        "#The prediction file contains the predicted answer to each question id for the first TEST_NUM_SAMPLES.\n",
        "! python -m json.tool $exp_dir/$INFER_PREDICTION_OUTPUT_FILE"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ref1qSonGNhP"
      },
      "source": [
        "If you have NeMo installed locally, you can also train the model with \n",
        "[NeMo/examples/nlp/question_answering/get_squad.py](https://github.com/NVIDIA/NeMo/blob/main/examples/nlp/question_answering/question_answering_squad.py).\n",
        "\n",
        "To run training script, use:\n",
        "\n",
        "`python question_answering_squad.py model.train_ds.file=TRAIN_FILE model.validation_ds.file=VAL_FILE model.test_ds.file=TEST_FILE`\n",
        "\n",
        "To improve the performance of the model, train with multi-GPU and a global batch size of 24. So if you use 8 GPUs with `trainer.gpus=8`, set `model.train_ds.batch_size=3`"
      ]
    }
  ]
}